The iDAI.publication: extracting and linking information in the publications of the German Archaeological Institute (DAI)
We present the results of our attempt to use NLP tools to identify named entities in the publications of the Deutsches Archäologisches Institut (DAI) and to link the identified locations to entries in the iDAI.gazetteer. Our case study focuses on articles written in German and published in the journal Chiron between 1971 and 2014. We describe the annotation pipeline, which starts from the digitized texts published in the new DAI portal. We evaluate the performance of geoparsing and NER and test an approach to improve the accuracy of the latter.
Treebanking in the world of Thucydides. Linguistic annotation for the Hellespont Project
The Hellespont project (DAI, Tufts University) aims to structure the text of a passage from the ancient Greek historian Thucydides (1.89-118), in order to highlight the events, persons and peoples that populate the world of the author and to connect the different digital sources available for their study. Event annotation in particular requires an in-depth linguistic analysis of morphology, syntax and semantics. However, the available resources for Ancient Greek do not provide adequate standards to support the encoding of semantic and pragmatic phenomena in Ancient Greek texts. In this paper, we discuss the motivation for the project and how we adapted the so-called tectogrammatical annotation of the Prague Dependency Treebank to identify the events and describe their structure. The linguistic notion of valency, which is central to tectogrammatical sentence representation, proves very useful for this analysis of Ancient Greek.
The Ancient Greek Dependency Treebank: Linguistic Annotation in a Teaching Environment
This chapter argues that manual linguistic annotation of Ancient Greek texts can be effectively employed to teach Greek literature and language. Under the supervision of a teacher, students can be engaged in the ongoing creation of the Ancient Greek Dependency Treebank. With the help of one example from Sophocles (Tr. 962–3), we will illustrate how the collective work of treebanking in a class environment provides an ideal occasion to discuss the methods of Classical Philology and the history of interpretation of a given passage; more importantly, while producing a treebank annotation, students can learn how to read a complex text in its literary and communicative context following the methods of textual criticism. New and old research questions emerge from the work; at the same time, through the final annotation the students will produce a tangible contribution to a crucial initiative that is likely to change the way Greek grammar will be studied in the future.
Nominal vs copular clauses in a diachronic corpus of Ancient Greek historians. A treebank-based analysis
We study the distribution of the nominal and copular construction of predicate nominals in a subset of authors from the Ancient Greek Dependency Treebank (AGDT). We concentrate on the texts of the historians Herodotus, Thucydides (both 5th century BCE) and Polybius (2nd century BCE). The data comprise a sample of 440 sentences (Hdt = 175, Thuc = 91, Pol = 174). We analyze the impact of four features that have been discussed in the literature and can be observed in the annotation of AGDT: (1) order of constituents, (2) part of speech of the subjects, (3) type of clause and (4) length of the clause. Furthermore, we test how the predictive power of these factors varies in time from Herodotus and Thucydides to Polybius with the help of a logistic-regression model. The analysis shows that, contrary to a simplistic opinion, the nominal construction does not drop into irrelevance in Hellenistic Greek. Moreover, an analysis of the distributions in the authors highlights a remarkable continuity in the usage patterns. Further work is needed to improve the predictive power of our logistic-regression model and to integrate more data in view of a more comprehensive quantitative diachronic study
“Please, ChatGPT, write me an essay in my own words…”: can teachers really tell their students from bots?
No abstract available
Issues in Building the LiLa Knowledge Base of Interoperable Linguistic Resources for Latin
Purpose: This abstract presents the architecture and the current state of the LiLa Knowledge Base (https://lila-erc.eu), i.e., a collection of multifarious linguistic resources for Latin described with the same vocabulary of knowledge description, by using common data categories and ontologies developed by the Linguistic Linked Open Data (LLOD) community according to the principles of the Linked Data paradigm
Generative AI paradigm shift in Higher Education: balancing myths and realities in assessment marking and design
Expanding on the work of Dwivedi et al. (2023) and De Vita et al. (2023), this study investigates the challenges and opportunities associated with the utilization and potential misuse of AI chatbots in the context of higher education, specifically emphasizing the design and implementation of authentic student assessments. These technologies are on the brink of achieving a level of sophistication that will fundamentally transform conventional approaches to student assessments. This study presents an update of the findings initially shared at the Greenwich Business School Learning and Teaching Festival 2023.
Initial academic responses to generative AI were heavily concerned with its role in academic dishonesty and the broader implications for maintaining academic integrity. Nonetheless, this study argues that the strategic adoption of generative AI tools could catalyze the creation of novel, authentic assessment models that fully integrate and capitalize on the advancements in AI. Anticipating that this integration may inspire future students to critically evaluate the role of AI in their academic programs, it could also encourage them to utilize AI tools for enhancing their critical thinking, learning outcomes, employability, and ethical application.
Addressing the challenge of identifying the misuse of AI in academia, the study discusses two main strategies. The first involves the use of AI-powered detection tools, which, despite their promise, come with limitations such as high costs and limited adoption. The second strategy focuses on analyzing linguistic features to detect AI-generated content, emphasizing the importance of educators' familiarity with their students' writing styles.
An experiment is conducted using specific AI chatbots to generate essays, aiming to mimic undergraduate students' writing styles. These AI-generated essays are then compared with actual student submissions from a UK university using computational linguistic analysis. The preliminary findings reveal surprising consistencies in the use of language between the AI-generated and student-written essays, including similar use of word classes and syntactic relations. Interestingly, the AI's writing style shows preferences for longer words and more adjectives, aligning along multiple dimensions with the style of top-graded student essays.
Concluding, the study suggests that as generative AI becomes more entrenched in educational settings, the pedagogical methods employed by educators are likely to face more rigorous examination. Against this backdrop, the authors advocate a forward-looking research agenda in this domain, where the integration of AI in HE should not be seen as a threat but as an opportunity to rethink assessment methods and foster authentic learning experiences. The authors invite further discussion and insights on the future role of AI in the academic landscape
“Please, ChatGPT, write me an essay in my own words...”: can teachers really tell their own students from bots?
Building upon the recent contributions of Dwivedi et al. (2023) and De Vita et al. (2023), this paper delivers a critical discussion of the challenges and opportunities associated with the utilization and potential misuse of OpenAI's ChatGPT chatbot in the context of higher education, specifically emphasizing the design and implementation of authentic student assessments. Following a brief evaluation of the impact of early ChatGPT iterations (versions 3.5 and older) on the administration of student evaluations at Greenwich Business School (GBS) in 2023, the authors of this paper posit that current and future versions of generative AI technologies (such as ChatGPT for textual content and DALL-E for visual imagery) are poised to reach levels of sophistication and complexity that have the potential to precipitate a paradigm shift and disruption in "traditional" student assessments. Initial academic discourse surrounding generative AI predominantly pointed at its potential for academic misconduct and subsequent implications for academic integrity. However, the potential to harness the capabilities afforded by generative AI tools may serve as a driving force for the development of authentic assessment paradigms that fully embrace and integrate current and future AI advancements. As future students may inquire whether academic programmes and/or modules "incorporate AI," embracing this technology may inspire them to leverage AI tools to develop their critical thinking skills, learning experiences, employability, and ethical use. The pedagogical practices of educators will likely undergo increased scrutiny as the influence of generative AI permeates academic institutions on a wider scale. Within this context, the authors propose a future research agenda in this field, including potential methodologies applicable within the context of pedagogy in higher education
The Syntax of the Heroes? A Treebank-Based Approach to the Language of the Sophoclean Characters
This paper lays the foundation for a treebank-based study of the syntax of the characters and choruses in Sophocles. The complete morpho-syntactic annotation encoded in the Ancient Greek and Latin Dependency Treebank (AGLDT), published by the Perseus Project, is used to extract information and statistics on the syntactic constructions from five of the seven extant tragedies of Sophocles (with the exclusion of Philoctetes and Oedipus at Colonus, which are not yet published in the AGLDT). Following the seminal approach applied by J.F. Burrows to the novels of Jane Austen, we investigate the distributions of the 30 most frequent dependency relations between one part of speech and another (for instance, noun-adjective or preposition-noun). This program raises a series of crucial methodological questions, concerning both practical and theoretical aspects, which are discussed here in full. By examining some of the most basic statistics used by Burrows, such as the correlation between characters based on the distributions of the constructions, it is already possible to isolate interesting syntactic phenomena that appear to characterize the diction of specific figures, such as Creon in the Antigone, or Electra and the Pedagogue in the Electra.